Non-Euclidean or Non-metric Measures Can Be Informative

Authors

  • Elzbieta Pekalska
  • Artsiom Harol
  • Robert P. W. Duin
  • Barbara Spillmann
  • Horst Bunke
Abstract

Statistical learning algorithms often rely on the Euclidean distance. In practice, non-Euclidean or non-metric dissimilarity measures may arise when contours, spectra or shapes are compared by edit distances or as a consequence of robust object matching [1,2]. It is an open issue whether such measures are advantageous for statistical learning or whether they should be constrained to obey the metric axioms. The k-nearest neighbor (NN) rule is widely applied to general dissimilarity data as the most natural approach. Alternative methods exist that embed such data into suitable representation spaces in which statistical classifiers are constructed [3]. In this paper, we investigate the relation between non-Euclidean aspects of dissimilarity data and the classification performance of the direct NN rule and some classifiers trained in representation spaces. This is evaluated on a parameterized family of edit distances, in which parameter values control the strength of non-Euclidean behavior. Our finding is that the discriminative power of this measure increases with increasing non-Euclidean and non-metric aspects until a certain optimum is reached. The conclusion is that statistical classifiers perform well and the optimal values of the parameters characterize a non-Euclidean and somewhat non-metric measure.
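The terms "non-metric" and "non-Euclidean" in the abstract can be made concrete with a small sketch. This is not the authors' code; it assumes a symmetric dissimilarity matrix with a zero diagonal, and the function names `triangle_violations` and `negative_eigenfraction` are illustrative. The first function counts triangle-inequality violations (non-metric behavior). The second uses the standard double-centering construction from classical scaling: a dissimilarity matrix is Euclidean only if the resulting Gram matrix has no negative eigenvalues, so the share of negative spectral mass is one possible indicator of non-Euclidean behavior.

```python
import numpy as np


def triangle_violations(D, tol=1e-12):
    """Count ordered triples (i, j, k) with D[i, k] > D[i, j] + D[j, k]."""
    D = np.asarray(D, dtype=float)
    via = D[:, :, None] + D[None, :, :]   # via[i, j, k] = D[i, j] + D[j, k]
    direct = D[:, None, :]                # direct[i, ., k] = D[i, k]
    return int(np.sum(direct > via + tol))


def negative_eigenfraction(D):
    """Fraction of absolute eigenvalue mass that is negative in the
    double-centered Gram matrix G = -0.5 * J * D**2 * J."""
    D = np.asarray(D, dtype=float)
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n   # centering matrix
    G = -0.5 * J @ (D ** 2) @ J
    eig = np.linalg.eigvalsh((G + G.T) / 2)  # symmetrize for numerical safety
    return float(np.abs(eig[eig < 0]).sum() / np.abs(eig).sum())


# Euclidean example: pairwise distances between the corners of a unit square.
pts = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
D_euclid = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)

# Non-metric example: going from object 0 to object 2 via object 1 is "shorter".
D_nonmetric = np.array([[0.0, 1.0, 5.0],
                        [1.0, 0.0, 1.0],
                        [5.0, 1.0, 0.0]])

for name, D in [("euclidean", D_euclid), ("non-metric", D_nonmetric)]:
    print(name, triangle_violations(D), negative_eigenfraction(D))
```

A measure can satisfy the triangle inequality and still be non-Euclidean, so the two indicators are worth reporting separately.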


Similar resources

On Not Making Dissimilarities Euclidean

Non-metric dissimilarity measures may arise in practice, e.g., when objects represented by sensory measurements or by structural descriptions are compared. It is an open issue whether such non-metric measures should be corrected in some way to be metric or even Euclidean. The reason for such corrections is the fact that pairwise metric distances are interpreted in metric spaces, while Euclidean d...


Spatial Analysis in curved spaces with Non-Euclidean Geometry

The ultimate goal of spatial information, both as part of technology and as science, is to answer questions and issues related to space, place, and location. Therefore, geometry is widely used for description, storage, and analysis. Undoubtedly, one of the most essential features of spatial information is geometric features, and one of the most obvious types of analysis is the geometric type an...


On the metric triangle inequality

A non-contradictible axiomatic theory is constructed under the local reversibility of the metric triangle inequality. The obtained notion includes the metric spaces as particular cases and the generated metric topology is $T_1$-separated and, in general, non-Hausdorff.


Classification using non-standard metrics

A large variety of supervised or unsupervised learning algorithms is based on a metric or similarity measure of the patterns in input space. Often, the standard Euclidean metric is not sufficient and much more efficient and powerful approximators can be constructed based on more complex similarity calculations such as kernels or learning metrics. This procedure is beneficial for data in Euclide...


Using the Triangle Inequality to Reduce the Number of Comparisons Required for Similarity-Based Retrieval

Dissimilarity measures, the basis of similarity-based retrieval, can be viewed as a distance and a similarity-based search as a nearest neighbor search. Though there has been extensive research on data structures and search methods to support nearest-neighbor searching, these indexing and dimension-reduction methods are generally not applicable to non-coordinate data and non-Euclidean distance ...




Publication date: 2006